P2P Routing of Range Queries in Skewed Multidimensional Data Sets
Abstract
We present middleware for storing multidimensional data sets on Internet-scale distributed systems and for efficiently performing range queries on them. Our structured overlay network SONAR (Structured Overlay Network with Arbitrary Range queries) places keys that are adjacent in the key space on logically adjacent nodes in the overlay and can therefore process a multidimensional range query with a single logarithmic data lookup followed by local forwarding. The specified ranges may have arbitrary shapes, such as rectangles, circles, spheres, or polygons. Empirical results demonstrate the routing performance of SONAR on several data sets, ranging from real-world data to artificially constructed worst-case distributions. We study the quality of SONAR's routing structure, which is based on local knowledge only, and measure the indegree of the overlay nodes to find potential hot spots in the overlay. We show that SONAR's routing table is self-adjusting, even in extreme situations, and always holds at most ⌈log N⌉ routing entries.
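The "one lookup, then local forwarding" idea can be illustrated with a small sketch. It is not SONAR's implementation. It assumes a hypothetical overlay in which n x n nodes each own one unit square of a 2D key space, whereas SONAR targets skewed data with non-uniform partitions. It also replaces the logarithmic lookup with a direct index computation. All function names are invented for illustration.

```python
from collections import deque

def circle_intersects_rect(cx, cy, r, rect):
    """True if the closed disc (cx, cy, r) touches the closed rectangle."""
    x0, y0, x1, y1 = rect
    # Closest point of the rectangle to the circle centre.
    nx = min(max(cx, x0), x1)
    ny = min(max(cy, y0), y1)
    return (nx - cx) ** 2 + (ny - cy) ** 2 <= r * r

def build_grid(n):
    """Hypothetical overlay: node (i, j) owns the unit square [i, i+1] x [j, j+1]."""
    return {(i, j): (i, j, i + 1, j + 1) for i in range(n) for j in range(n)}

def neighbors(cell, n):
    """Logically adjacent nodes: the cells sharing an edge with `cell`."""
    i, j = cell
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if 0 <= a < n and 0 <= b < n:
            yield (a, b)

def range_query(grid, n, cx, cy, r):
    """Return the nodes whose key region intersects a circular range.

    Assumes the circle centre lies inside the grid.
    """
    # Stand-in for the overlay lookup: jump straight to the node
    # responsible for the centre of the range.
    start = (min(int(cx), n - 1), min(int(cy), n - 1))
    seen = {start}
    queue = deque([start])
    # Local forwarding: each reached node passes the query on to
    # adjacent nodes whose region also intersects the range.
    while queue:
        cell = queue.popleft()
        for nb in neighbors(cell, n):
            if nb not in seen and circle_intersects_rect(cx, cy, r, grid[nb]):
                seen.add(nb)
                queue.append(nb)
    return seen
```

For example, `range_query(build_grid(8), 8, 3.5, 3.5, 1.0)` reaches the 3 x 3 block of nodes around cell (3, 3). Any other shape can be handled by swapping in a different intersection test, since forwarding only needs to know whether a neighbor's region overlaps the range.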
Similar resources
Multidimensional Range Query Processing in Structured P2P Overlays
The Multi-Ring Content Addressable Network (RCAN) [1] is a dynamic distributed index structure for efficient management of large multidimensional data sets over peer-to-peer systems. RCAN proposes a new self-organizing overlay topology that achieves logarithmic routing performance and effective load balancing while minimizing the maintenance overhead during node joins and departures. In this pape...
SkipTree: A Scalable Range-Queryable Distributed Data Structure for Multidimensional Data
This paper presents the SkipTree, a new balanced, distributed data structure for storing data with multidimensional keys in a peer-to-peer network. The SkipTree supports range queries as well as single point queries which are routed in O(log n) hops. SkipTree is fully decentralized with each node being connected to O(log n) other nodes. The memory usage for maintaining the links at each node is ...
Optimizing Physical Design of Multidimensional Files for Join Queries
Optimally organizing multidimensional data is NP-hard. The little work that has been done on optimising multidimensional data was limited to uniform data distributions and rarely considered the probability of use of each query. The work that did consider the probability of use of each query was limited to either partial match queries or range queries. This work shows that by combining heurist...
Manycore processing of repeated range queries over massive moving objects observations
The ability to timely process significant amounts of continuously updated spatial data is mandatory for an increasing number of applications. Parallelism enables such applications to face this data-intensive challenge and allows the devised systems to feature low latency and high scalability. In this paper we focus on a specific data-intensive problem, concerning the repeated processing of huge...
Cost-based query-adaptive clustering for multidimensional objects with spatial extents. (Groupement d'Objets Multidimensionnels Etendus avec un Modèle de Coût Adaptatif aux Requêtes)
We propose a cost-based query-adaptive clustering solution for multidimensional objects with spatial extents to speed up execution of spatial range queries (e.g., intersection, containment). Our work was motivated by the emergence of many SDI applications (Selective Dissemination of Information) bringing out new real challenges for the multidimensional data indexing. Our clusterin...